3574 results found.
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Basque Bulgarian Danish Dutch English Estonian German Hungarian Irish Italian Portuguese Russian Serbian Slovenian Spanish
Availability:
Freely Available
License:
Size:
3 MByte
Production Status:
Newly created-in progress
Use:
Lexicon Creation/Annotation
- Paper title: A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sina Ahmadi | Monolingual Word Sense Alignment | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Size:
1432 sentences
Production Status:
Newly created-in progress
Use:
Word Sense Disambiguation
- Paper title: SeCoDa: Sense Complexity Dataset
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | David Strohmaier | SeCoDa | /N |
Documentation:
paper describes creation and format of dataset
Written
Corpus,
Language Type:
Monolingual
Languages:
Dutch English Estonian Finnish French German Hungarian Italian Latvian Polish Portuguese Romanian Slovak Slovenian Spanish Swedish
Availability:
Freely Available
License:
OpenSource
Size:
8 MByte
Production Status:
Newly created-finished
Use:
Anaphora, Coreference
- Paper title: Exploiting Cross-Lingual Hints to Discover Event Pronouns
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sharid Loáiciga | Event it Pronouns | /N |
Documentation:
yes, English, publicly available
Written
Ontology,
Language Type:
Multilingual
Languages:
Chinese Dutch English French German Italian Japanese Portuguese Russian Spanish and many others
Availability:
Freely Available
License:
ConceptNet 5 is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License
Size:
34 million concepts
Production Status:
Existing-updated
Use:
Semantic Web
- Paper title: Using Crowdsourced Exercises for Vocabulary Training to Expand ConceptNet
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Christos Rodosthenous | ConceptNet 5 | /N |
Documentation:
https://github.com/commonsense/conceptnet5/wiki
Written
Corpus,
Language Type:
Bilingual
Languages:
English Maltese
Availability:
From Owner
License:
Size:
81944 words
Production Status:
Newly created-finished
Use:
Hate speech
- Paper title: Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Stavros Assimakopoulos | CONTACT annotated dataset for Malta | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English Maltese
Availability:
Not Applicable
License:
Size:
124000000 words
Production Status:
Newly created-in progress
Use:
Hate speech
- Paper title: Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Stavros Assimakopoulos | MaNeCo corpus | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
Size:
438 articles
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
- Paper title: Temporal Histories of Epidemic Events (THEE): A Case Study in Temporal Annotation for Public Health
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Jingcheng Niu | TheeBank | /N |
Documentation:
A publicly available annotation standard (written in English) will be provided with the corpus
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Not yet decided
Size:
9.47 MByte, details of 446 authors and 9,399 publications
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
- Paper title: Exploiting Citation Knowledge in Personalised Recommendation of Recent Scientific Publications
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Miriam Fernandez | Citation Knowledge based Dataset | /N |
Documentation:
English (available at the same URL as the resource)
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
Yes, work in progress
Size:
100 GByte
Production Status:
Newly created-finished
Use:
Speech Recognition/Understanding
- Paper title: Introducing MULAI: A Multimodal Database of Laughter during Dyadic Interactions
- Paper track: Multimodality/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Michel-Pierre Jansen | MULAI: A Multimodal Database of Laughter during Dyadic Interactions | /N |
Documentation:
Will be publicly available.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
BNC XML Edition licence
Size:
100 million words
Production Status:
Existing-used
Use:
Document Classification, Text categorisation
- Paper title: The Connection between the Text and Images of News Articles: New Insights for Multimedia Analysis
- Paper track: Multimodality/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Martha Larson | British National Corpus, version 3 (BNC XML edition) | /N |
Documentation:
None




